AITopics | Vladivostok

Existing benchmarks for evaluating long-context language models (LCLMs) primarily focus on long-context recall, requiring models to produce short responses based on a few critical snippets while processing thousands of irrelevant tokens. We introduce LongProc (Long Procedural Generation), a new benchmark that requires both the integration of highly dispersed information and long-form generation. LongProc consists of six diverse procedural generation tasks, such as extracting structured information from HTML pages into a TSV format and executing complex search procedures to create travel plans. These tasks challenge LCLMs by testing their ability to follow detailed procedural instructions, synthesize and reason over dispersed information, and generate structured, long-form outputs (up to 8K tokens). Furthermore, as these tasks adhere to deterministic procedures and yield structured outputs, they enable reliable rule-based evaluation. We evaluate 17 LCLMs on LongProc across three difficulty levels, with maximum numbers of output tokens set at 500, 2K, and 8K. Notably, while all tested models claim a context window size above 32K tokens, open-weight models typically falter on 2K-token tasks, and closed-source models like GPT-4o show significant degradation on 8K-token tasks. Further analysis reveals that LCLMs struggle to maintain long-range coherence in long-form generations. These findings highlight critical limitations in current LCLMs and suggest substantial room for improvement. Data and code available at: https://princeton-pli.github.io/LongProc
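The abstract notes that deterministic procedures with structured outputs allow reliable rule-based evaluation. A minimal sketch of what such a check could look like for a TSV extraction task follows. The function names `parse_tsv` and `row_f1`, and the choice of row-level F1 as the metric, are assumptions made for illustration; they are not taken from the LongProc code base.

```python
def parse_tsv(text):
    """Split TSV text into a list of row tuples, skipping blank lines."""
    rows = []
    for line in text.strip().splitlines():
        cells = tuple(cell.strip() for cell in line.split("\t"))
        if any(cells):
            rows.append(cells)
    return rows


def row_f1(predicted, reference):
    """Row-level F1 between predicted and reference TSV (hypothetical metric).

    A row counts as correct only if every cell matches exactly.
    """
    pred_rows = set(parse_tsv(predicted))
    gold_rows = set(parse_tsv(reference))
    true_positives = len(pred_rows & gold_rows)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_rows)
    recall = true_positives / len(gold_rows)
    return 2 * precision * recall / (precision + recall)


reference = "Alice\t2021\nBob\t2022"
predicted = "Alice\t2021\nBob\t2023"
print(row_f1(predicted, reference))
```

Because the comparison is exact-match over rows, the score needs no model-based judge, which is the property the abstract attributes to its tasks.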

benchmarking long-context language model, carol location, tweezers location, (13 more...)

arXiv.org Artificial Intelligence

2501.05414

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Europe > Sweden > Stockholm > Stockholm (0.05)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.05)
(33 more...)

Genre: Research Report (1.00)

Industry: Consumer Products & Services > Travel (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.92)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

North Korean troops in Ukraine 'fair game', US warns Russia as war rages on

Al Jazeera, Oct-24-2024, 09:06:18 GMT

United States defence secretary Lloyd Austin has weighed in on reports that North Korea was preparing to enter the Ukraine war with troops. "If they are co-belligerents, if their intention is to participate in this war on Russia's behalf, that is a very, very serious issue," Austin said. Austin was returning from his fourth visit to Kyiv, where he announced a $400m package of US weapons for Ukraine. John Kirby, White House national security spokesman, said Washington believes that at least 3,000 North Korean soldiers arrived this month by sea in Vladivostok, Russia's largest Pacific port. "These soldiers then travelled onward to multiple Russian military training sites in eastern Russia, where they are currently undergoing training," Kirby said on Wednesday.

russia, russian force, ukraine, (14 more...)

Al Jazeera

Country:

Africa (0.30)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.26)
Asia > Russia > Far Eastern Federal District > Primorsky Krai > Vladivostok (0.26)
(11 more...)

Industry:

Government > Military (1.00)
Government > Regional Government > Europe Government > Russia Government (0.93)
Government > Regional Government > Asia Government > Russia Government (0.93)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.31)

A Russian Jeopardy! Data Set for Question-Answering Systems

Mikhalkova, Elena

arXiv.org Artificial Intelligence, Oct-7-2024

Question answering (QA) is one of the most common NLP tasks and is related to named entity recognition, fact extraction, semantic search, and several other fields. In industry, it is valued in chatbots and corporate information systems. It is also a challenging task that attracted the attention of a very general audience through the quiz show Jeopardy! In this article, we describe a Jeopardy!-like Russian QA data set collected from the official Russian quiz database Chgk (che ge ka). The data set includes 379,284 quiz-like questions, 29,375 of which come from "Own Game", the Russian analogue of Jeopardy! We examine its linguistic features and the related QA task, and conclude by discussing the prospects for a QA competition based on the data set collected from this database.

jeopardy, russian jeopardy, tournament, (17 more...)

arXiv.org Artificial Intelligence

2112.02325

Country:

Europe > Russia (0.14)
Asia > Russia > Ural Federal District > Tyumen Oblast > Tyumen (0.05)
Europe > United Kingdom > England (0.04)
(8 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Jeopardy! (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)

Space warfare: US, China, and Russia are gearing up for the next frontier of armed conflict

FOX News, Jan-24-2024, 13:00:59 GMT

Arthel Neville welcomes former U.S. Defense Intelligence Officer Rebekah Koffler to discuss the massive global cyberattack that impacted several federal agencies. The next big war may be fought in space. As the Pentagon is gearing up for a future celestial conflict, so are our chief adversaries, China and Russia. Here's why "Star Wars" is no longer merely a topic of science fiction. The best way to avoid space warfare is to be ready for it. On Dec. 28, Elon Musk's SpaceX launched the Pentagon's highly secretive X-37B Orbital Test Vehicle, an unmanned reusable robotic spacecraft operated by the Air Force in collaboration with Space Force.

artificial intelligence, conflict, russia, (16 more...)

FOX News

Country:

Europe > Ukraine (0.05)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
Asia > Middle East > Iran (0.05)
(19 more...)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)
Government > Regional Government > Asia Government (0.72)

Technology: Information Technology > Artificial Intelligence (0.71)

Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs

Roberts, Jonathan, Lüddecke, Timo, Sheikh, Rehan, Han, Kai, Albanie, Samuel

arXiv.org Artificial Intelligence, Jan-16-2024

Multimodal large language models (MLLMs) have shown remarkable capabilities across a broad range of tasks but their knowledge and abilities in the geographic and geospatial domains are yet to be explored, despite potential wide-ranging benefits to navigation, environmental research, urban development, and disaster response. We conduct a series of experiments exploring various vision capabilities of MLLMs within these domains, particularly focusing on the frontier model GPT-4V, and benchmark its performance against open-source counterparts. Our methodology involves challenging these models with a small-scale geographic benchmark consisting of a suite of visual tasks, testing their abilities across a spectrum of complexity. The analysis uncovers not only where such models excel, including instances where they outperform humans, but also where they falter, providing a balanced view of their capabilities in the geographic domain. To enable the comparison and evaluation of future models, our benchmark will be publicly released.

experiment, gpt-4v, llava-1, (16 more...)

arXiv.org Artificial Intelligence

2311.14656

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Lower Saxony > Gottingen (0.14)
North America > United States > New York (0.04)
(45 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

English to Arabic machine translation of mathematical documents

Eddahibi, Mustapha, Mensouri, Mohammed

arXiv.org Artificial Intelligence, Dec-2-2023

This paper describes the development of a machine translation system tailored specifically for LaTeX mathematical documents. The system focuses on translating English LaTeX mathematical documents into Arabic LaTeX, catering to the growing demand for multilingual accessibility in scientific and mathematical literature. With the vast proliferation of LaTeX mathematical documents, the need for an efficient and accurate translation system has become increasingly pressing. This paper addresses the need for a robust translation tool that enables seamless communication and comprehension of complex mathematical content across language barriers. The proposed system uses a Transformer model as the core of the translation system, aiming for improved accuracy and fluency in the translated Arabic LaTeX documents. Furthermore, the integration of RyDArab, an Arabic mathematical TeX extension, along with a rule-based translator for Arabic mathematical expressions, contributes to the precise rendering of complex mathematical symbols and equations in the translated output. The paper discusses the architecture and methodology of the developed system, highlighting its efficacy in bridging the language gap in the domain of mathematical documentation.

expression, mathematical expression, translation, (12 more...)

arXiv.org Artificial Intelligence

2312.03753

Country:

Africa > Middle East > Morocco > Souss-Massa Region > Agadir (0.05)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > Ontario > Toronto (0.04)
(5 more...)

Genre: Research Report (0.70)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Multilingual Event Linking to Wikidata

Pratapa, Adithya, Gupta, Rishubh, Mitamura, Teruko

arXiv.org Artificial Intelligence, Jul-16-2022

We present a task of multilingual linking of events to a knowledge base. We automatically compile a large-scale dataset for this task, comprising 1.8M mentions across 44 languages referring to over 10.9K events from Wikidata. We propose two variants of the event linking task: 1) multilingual, where event descriptions are from the same language as the mention, and 2) crosslingual, where all event descriptions are in English. On the two proposed tasks, we compare multiple event linking systems including BM25+ (Lv and Zhai, 2011) and multilingual adaptations of the biencoder and crossencoder architectures from BLINK (Wu et al., 2020). In our experiments on the two task variants, we find that both biencoder and crossencoder models significantly outperform the BM25+ baseline. Our results also indicate that the crosslingual task is in general more challenging than the multilingual task. To test the out-of-domain generalization of the proposed linking systems, we additionally create a Wikinews-based evaluation set. We present qualitative analysis highlighting various aspects captured by the proposed dataset, including the need for temporal reasoning over context and tackling diverse event descriptions across languages.
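For readers unfamiliar with the BM25+ baseline named in the abstract, the following is a minimal sketch of BM25+ scoring used to rank candidate event descriptions against a mention context. It assumes whitespace tokenization, the commonly used defaults `k1=1.2`, `b=0.75`, `delta=1.0`, and the IDF form log((N+1)/df); the function name `bm25_plus_scores` and the toy descriptions are hypothetical and do not reproduce the paper's actual setup.

```python
import math
from collections import Counter


def bm25_plus_scores(query, docs, k1=1.2, b=0.75, delta=1.0):
    """Score each document against the query with BM25+ (illustrative sketch)."""
    if not docs:
        return []
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Fall back to 1.0 so that a collection of empty documents does not divide by zero.
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs or 1.0

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # only terms present in the document contribute
            idf = math.log((n_docs + 1) / doc_freq[term])
            # The +delta term lower-bounds the contribution of a matched term.
            score += idf * ((k1 + 1) * tf / (length_norm + tf) + delta)
        scores.append(score)
    return scores


descriptions = [
    "2014 winter olympics in sochi",
    "battle of stalingrad",
    "olympics",
]
scores = bm25_plus_scores("sochi olympics", descriptions)
best = max(range(len(descriptions)), key=scores.__getitem__)
print(descriptions[best])
```

A lexical scorer like this has no way to match a mention to a description written in a different language unless they share surface tokens, which is one plausible reason a learned encoder would be preferred in the crosslingual setting.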

computational linguistic, dataset, wikidata, (14 more...)

arXiv.org Artificial Intelligence

2204.06535

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(40 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports > Olympic Games (1.00)
Government > Military (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

Russian tankers going dark raises flags on sanctions evasion

The Japan Times, Mar-28-2022, 08:16:06 GMT

Russian tankers carrying oil chemicals and oil products are increasingly concealing their movements, a phenomenon that some maritime experts warn could signal attempts to evade unprecedented sanctions prompted by the invasion of Ukraine. In the week ending March 25, there were at least 33 occurrences of so-called "dark activity" -- operating while onboard systems to transmit their locations are turned off -- by Russian tankers, said Windward Ltd., an Israeli consultancy that specializes in maritime risk using artificial intelligence and satellite imagery. That's more than double the weekly average of 14 in the past year. The dark operations occurred mainly in or around Russia's exclusive economic zone, according to Windward, which conducted the research at Bloomberg's request. The ships engaging in dark activity include vessels connected to big corporations and multinational shipping firms, as well as small businesses, according to Windward.

russian tanker, sanction evasion, windward, (9 more...)

The Japan Times

Country:

North America > United States (1.00)
Europe > Russia (0.31)
Europe > Ukraine (0.26)
(5 more...)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy (0.95)
Transportation (0.74)

Technology: Information Technology > Artificial Intelligence (0.91)
